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the issue of uncertainties in the folding period. The final results of our analysis are 
expressed in the form of posterior probability distributions on the signal parameters 
(including the TOA) from a single observation. 
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1. Introduction 

Pulsar timing arrays could well be used to detect ultra-low frequency gravitational waves 
(GWs) within the next 5-10 years pQ. This is an especially exciting prospect given the 
concurrent efforts of the LIGO- Virgo Scientific collaboration (LVC) whose aim is to 
make direct detection of GWs (in the ~10-1000 Hz regime) using the 2 nd generation of 
ground based interferometric detectors within the same timescale [2]. 

In this work we outline the beginnings of a Bayesian approach to the detection of 
GWs with pulsar timing using simplistic signal and noise models onto which can be 
built further levels of sophistication in the future. A key long-term aim of our analysis 
is to improve our ability to time existing millisecond pulsars by a factor of 3-10 [3j H]. 
One of the main problems to be overcome is to be able to sensibly account for the excess 
low- frequency noise seen in many stable millisecond pulsars [5J. We focus on a single 
piece of the complete pulsar timing analysis, the generation of time-of-arrival (TOA) 
measurements. Given a single pulsar observation^, this is the arrival time of the average 
pulse at the telescope where in this context "average" means the sum of pulses produced 
by "folding" the data with a periodicity equal to the assumed pulse period. It is from 
these TOAs that pulsar astronomers then model the spin evolution of pulsars taking 
into account the motion of the radio telescope relative to the pulsar |Bj. The presence 
of GWs in the field between the telescope and the pulsar will result in small shifts in 
the arrival times of pulses [H |8] . 

We choose to limit our investigation to single pulsar observations (typically 
100 — 1000s seconds in duration) and since TOAs are defined in the reference frame 
of the telescope and the GW timescale 3> the timescale of a single observation, we are 
able to neglect any GW effect in our analysis. We will discuss two different strategies 
for the estimation of parameters (including the TOA) from two separate starting points, 
what we will call "search-mode" data and "pre-folded" data. In both cases we perform 
the analysis using a commonly used Bayesian integration algorithm in order to obtain 
posterior probability distributions on the signal parameters. 

We note that our approach is aimed as a starting point for future more realistic 
scenarios and that it can be viewed as an approach being built from the bottom-up. We 
mean this in the sense that we try to start from the most basic datasets available (see 
Sees. [3] and [4]) and attempt to build a data-analysis framework in which the multitude 
of physical processes affecting pulsar signals can be included and accounted for. In 
contrast, other work on the specifics of GW detection using pulsar timing arrays has 
taken a more top-down approach. These analyses have started with timing residuals, 
the result of a fit to the data assuming non GW effects (effectively the end of the pulsar 
data processing chain), and either neglected this potential inconsistency 0, [10] or made 
attempts to account for it [TT] . 

The paper is organised as follows. In Sec. [2] we describe our simplistic signal and 

% We discuss in Sec. [6] that while TOAs are associated with individual pulsar observations (or subsets 
of an observation), in general a given TOA will depend on parameters "fit" to previous observations. 
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noise model. In Sees. [3] and [4] we then go on to describe the form of this signal model 
in two different representations of the original dataset. The basic concepts concerning 
our Bayesian approach to the parameter estimation problem can be found in Sec. [5] and 
finally we discuss our conclusions and potential future developments in Sec. [6j 

2. The signal : A toy model 

We begin with a dataset defined on a discrete 2-dimensional grid of time tj versus radio- 
frequency fk of which an example is shown in Fig. [TJ Data of this kind is often referred 
to as "search-mode" data since this data format is the kind used when performing 
searches for unknown pulsars. Each of the M rows of the 2-dimensional grid is a 
timeseries of radio-frequency power measured within a radio-frequency band with central 
frequency fk- Typical sampling times and observation durations are ~ 10s of /^seconds 
and 100 — 1000s of seconds respectively. Typical frequency channel widths and total 
detector bandwidths are ~ 1 MHz and 100 — 1000s MHz respectively. For "search-mode" 
data we assume the following signal model 

x (tj, fk) = s(tj, f k ) + n(tj, fk), (1) 

where x(tj,f k ) represents the discretely sampled dataset, s(tj,f k ) is the signal and 
n(tj, fk) is the noise which for simplicity we assume as independent Gaussian distributed 
random variables with zero mean. The signal itself we define as 

n'-l 

s (tjJk) = ^Aexp 

o=0 

where a sums over all n' pulses that intersect with the observation^} We use A as 
the pulse peak amplitude, w as the pulse width, and \i a k as the centre of the a'th 
pulse in the fc'th frequency channel. Note that we are modelling each pulse as having a 
single Gaussian profile component and that the amplitude and width remain constant 
in both time and with frequency channel. The inclusion of additional Gaussian pulse 
components requires only a trivial modification to the model. In Sec. [6] we discuss 
numerous potential additions and modifications required to make this toy model a more 
accurate representation of a real pulsar signal. 

The time at the centre of each pulse is defined as 

Hak = (0fc + a)P + £ a , (3) 

where P is the constant pulse period and 4>k is the phase (defined on the range [0, 1)) of 
the first pulse in the observation for the fc'th frequency channel. We have also included 
a random pulse "jitter" term where for each pulse we apply a random shift to the 

§ Due to the effects of dispersion, whilst the pulse period is equal in all frequency channels, a particular 
pulse near the end of the timeseries for a high-frequency channel can be delayed by dispersion such that 
it does not intersect with the observation at a lower frequency channel. The same applies to pulses near 
to the start of the timeseries in a lower frequency channel since they may arrive before the observation 
in a higher frequency channel. 



(tj f^ak) 

2w 2 



(2) 
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pulse arrival time where each shift £ a is drawn from a Gaussian distribution with zero 
mean and variance cr|. Such effects have been observed in several pulsars and can be 
attributed to unknown processes in the pulse emission mechanism and possibly related 
to giant pulses [12j H31 EH EE US]. We show in Sec. [3]that for our purposes, the effect of 
this particular pulse "jitter" modelling can be absorbed into a subset of the other signal 
parameters. 

The phase of the first pulse in each frequency channel <pk can be related to 
the phase $o, defined as the phase of the pulse at the midpoint frequency channel 
/mid = Wm ~ /i)/2 and with reference to the midpoint of the observation t = T/2 by 

fc = mod(^ + ^ + $ o ,l). (4) 

The relative delay due to dispersion in the fc'th frequency channel At(fk) is given by 

At(f k ) = 4.148808 x 10 3 (f^ 2 - / mi 2 d ) DM sec, (5) 

where DM is the dispersion measure in cm _3 pc and the units of the frequencies are 
MHz. Note that in our simplistic model we do not account for dispersion smearing 
within individual channels. 



3. Using "search-mode" data in the Fourier domain 



The signals received from pulsars are periodic and their frequency evolution is slow 
i.e. the timescale of frequency variation ^> the pulse period. By Fourier transforming 
each channel's time-series we find that a pulsar signal can be represented as a series 
of narrow-band harmonics as shown in Fig. [2j In a realistic situation we would expect 
to have some prior knowledge of the pulsar frequency before performing an analysis 
and therefore transforming the data in such a way allows us to isolate the regions in the 
dataset where the signal is concentrated (at the harmonics). This in-turn allows us to be 
economic with the data samples that we are interested in and will make any numerical 
likelihood computation more efficient. Let us define the discrete Fourier transform as 

AT-l 

x(u l ) = Y,<tj)e- 27Tjl/N At, (6) 

j=0 

where v\ represents the elements of a vector containing the positive discrete Fourier 
frequencies^]}] with frequency spacing 1/T and where N is the number of time samples. 
When applied to the time series from each frequency channel of a noise-free signal 
(defined by Eqs. 2|3) we obtain 



s{ v h fk) 



N-l n'-l 
j=0 a=0 



(tj H>ak) 

2u7 2 



e -2njl/N At 



22 S »( U hfk 



(7) 



a=0 



|| Note that there is a clear distinction between the radio frequencies (or frequency channels) and the 
Fourier frequencies. 
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Figure 1. A example of time vs frequency channel "search-mode" data showing a 
portion of a simulated dataset consisting of a strong signal in Gaussian noise. Here we 
show only the first 0.05 seconds of data (At = 64 //sec) for 8 1-MHz wide frequency 
channels. The signal has an amplitude A = 5, pulse width w = 0.25 msec, period 
P = 5 msec, a phase <&o = 0.2, and a dispersion measure DM = 100cm _3 pc. The 
noise has unit variance. 



where we have decomposed the complete Fourier transform into the Fourier transform 
of each pulse. We then have 



N-l 



3=0 



(tj f^ak) 
2^ 



- 2njl/N 



/oo 
exp 
-oo 



2w 2 



At, 



2-niv t y } dy 



= AwV2tt exp {—2-Kiviiiak — 2-k 2 w 2 v1 } , (8) 

where we have approximated the discrete sum over time samples with the continuous 
integral over the dummy variable y = t — fi a k assuming that each pulse itself spans 3> 1 
time bin and is not truncated by the edges of the timeseries^J We can now perform the 
sum over a (the individual pulses) to obtain the complete Fourier transform. However, 

% Clearly, of the n' pulses that intersect the time-frequency plane there will be some frequency channels 
for which a pulse does not appear in the timeseries due to dispersion. Equation [8] is therefore only 
applicable to those pulses in a particular frequency channel that are found to intersect the timeseries. 
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note that jj, a k is a function of £ Q , the random individual pulse arrival time jitter. We 
choose to average over this random variable under the assumption that there are a large 
number of pulses within the observation time. This averaging procedure leads to the 
following replacement: 



/oo -KTI^l 



= e-^M, (9) 

where we have replaced the pulse jitter term with its expectation value and used a 
Gaussian distribution for the pulse jitter with a zero mean and variance of cr|. Finally 
we obtain the following expression for the Fourier transform of the signal only timeseries 

§K fk) = ^^v^exp {-2n 2 v?wl} exp {-2^ k v x } W h (10) 

We can see from this equation that in the Fourier domain the signal can be decomposed 
into four parts. There is a real positive amplitude term proportional to the pulse 
amplitude, width, and number of pulses (n ~ T/P) which is multiplied by a frequency 
dependent envelope function that decays with increasing frequency at a rate proportional 
to the pulse width. There is also a unit amplitude complex phase term dependent upon 
the initial phase of the pulse in the given frequency channel multiplied by a second 
complex phase term W\ given by 



P 

W t = — exp {2-ixiViP} 



1 — exp {—2nii>iT} 



exp {2nii>iP} — 1 
which, in the limit of T ^> P can be written as 

cos(27rAz/; i gT) — 1 



^ I 2nAu lfj T 



2nAu ip T 



(12) 



where Auip — v\ — f3/P and (3 labels the individual signal harmonics of which there are 
n. This final complex phase term contains the information regarding the location and 
phase of the signal harmonics. We can now see that each signal harmonic is identical in 
shape but will each have a different phase and amplitude. In addition, as one moves to 
different frequency channels the phase of a given harmonic will be rotated by a quantity 
dependent upon the dispersion measure. 

Note that we have also re-parameterised the pulse amplitude and width using 

A t = (13) 
w 2 + cr| 



w^ = ^w 2 + al, (14) 

since with the addition of pulse jitter there exists a degeneracy between the original pulse 
amplitude and width. The product of the amplitude and width determine the overall 
amplitude of the Fourier transform of the signal and the sum of the squares of the width 
and the pulse jitter parameter determine the rate of the reduction in amplitude of the 
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Figure 2. A example of Fourier transformed "search-mode" data showing a portion 
of a simulated dataset consisting of a strong signal in Gaussian noise. The panels on 
the left show (in blue) the real part of the complex Fourier transform of the data as 
a function of Fourier frequency for 8 1-MHz wide frequency channels. The imaginary 
parts are shown (in red) on the right. The dataset used to generate this plot is identical 
to that shown in Fig. [T] and we have truncated the frequency range at 2 kHz since there 
the harmonic content of the signal is significantly reduced beyond this frequency. 



harmonics with increasing frequency. Using the data to measure this amplitude and its 
attenuation with increasing frequency will therefore not allow us to constrain all three 
parameter^] 

4. Using folded data 

The majority of pulsar timing data is pre-processed and reduced in volume by the process 
of folding. In this process, sections of the time series from each frequency channel of an 
observation will be folded with an assumed pulse period^ At the time of folding this 

+ We note that strictly speaking it would be possible to identify the values of all three parameters 
for a very strong signal. Pulse arrival time jitter acts to remove a small fraction of power from the 
harmonics and distribute it amongst the inter-harmonic frequency bins. Our analysis is restricted to 
localised regions at each harmonic and so we treat this information as lost. 

* The folding procedure can also include de-dispersion over a limited range of frequency channels where, 
just as with folding, an assumed value of the dispersion is used. Hence a large number of frequency 
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pulse period will not necessarily be the most accurate value. The pulse period itself is 
updated and refined with each subsequent observation. However, once data have been 
folded, most notably for older observations, the original search mode data may be lost, 
meaning that re-folding with the more refined period is not possible. 

We will focus on the effect of folding with an inaccurate pulse period. One can 
argue that since the most basic initial pulse period estimates will require a coherent 
measurement over some prior observation spanning many pulses, we should expect an 
initial worst case fractional uncertainty in the pulse period of ~ P/T which for a 10 
msec pulsar period and a 100 second coherent observation equates to a period error of 
~ 1 /^second. In addition to the pulse period, for realistic analyses a number of other 
parameters are used in the folding procedure such as the sky position coordinates, 
the intrinsic pulsar spin-frequency derivatives, the dispersion measure plus orbital 
parameters if the source is in a binary system. In our toy model we ignore these 
complications. 

We choose to define the result of the folding process for a single observation as a 
2-dimensional grid of pulse profiles labelled by time and channel frequency, an example 
of which is shown in Fig. [3} To perform a consistent analysis of such a dataset we 
take into account the fact that profiles have been obtained using a non-precise value 
of the pulse period. If we consider a dataset that has already been folded at a specific 
(non-exact) pulse period P' — P + AP then we can define a new folded dataset as 

n-1 

X(<f>', P', /*)=£> + <f>')P', fk) , (15) 

/9=0 

where (3 indexes each fold up to n = floor {T/P'). Substituting in our signal model 
(Eqs. [2] and |3| we can accurately approximate the discretely summed noise- free pulse 
profile as 

W P\ h)^fiw\\[\il t erf (fi, + b )~ erf (««)] » ( 16 ) 

I I ' 2=-l 



where we have used 



I AP| 
AP 



(P + AP)(0' + ^)-0 fc P 



(17) 



\AP\(n - 1) 



«v\'2 

In the calculation of Eq. [16] we have again replaced the pulse arrival time jitter term with 
its expectation value (as done in Sec. [3j and approximated the sum over pulses with a 
continuous integral. We have also been forced to re-parameterise the pulse amplitude 
and width parameters for the same reasons as described in the previous section and 
have chosen to use an identical re-parameterisation (defined in Eqs. 13 and 14). The 
summation over the index z is simply to account for the fact that folding a signal with 

channels can be grouped together into a single pulse profile measurement. We do not consider this 
potential feature of the folding procedure in this work. 
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Figure 3. An example simulated folded dataset showing folded pulse profiles for 4 
sub-intervals each spanning 25 seconds and for 8 frequency channels each spanning 1 
MHz. The simulated signal parameters are equal to those defined and used in Figs. [T] 
and [2] with the exception that here the signal amplitude A = 0.1 is significantly lower. 
Two curves are plotted in each panel, the blue curves are profiles obtained through 
folding with the true pulse period P. The red curves are the profiles obtained through 
folding with an pulse period error A_P' = 10 nsec. Note that this size period error is 
equivalent to a phase error of ~ 0.01 cycles over the course of a sub-integration. 



an arbitrary initial phase may separate the pulse profile into significant contributions 
spanning the <ft' = = 1 point. This also acts to account for the fact that if folding 
with an incorrect pulse period the true pulse will slowly drift across the <j)' space. In 
this scenario the tails of neighbouring pulses begin to contribute to the sum and by 
including the z = ±1 terms we are accurately modelling this effect. 



5. A Bayesian analysis 

The Bayesian component to our approach can be viewed as standard in the sense that 
we aim to simply apply Bayes probability theorem to the time-of-arrival problem with 
the intention of computing marginalised posterior probability distributions on the signal 
parameters. 
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Bayes theorem can be expressed as 




(19) 



where the term on the left-hand-side is the joint posterior probability distribution on the 
parameter set 6 given a dataset defined by the vector x and a chosen model represented 
by Ai. The function L(x\0,M) is the likelihood function describing the dataset x 
given the parameter set 6 and the model M.. The function -k(0\M) is the joint prior 
probability distribution on the parameter set 6 given the model M.. Finally we have 
the Bayesian evidence E(Ai\x,6) representing the probability of the model M. given 
the dataset x. 

To obtain marginalised posterior distributions on a particular signal parameter we 
are required to perform a multi-dimensional integration of the joint posterior distribution 
over the remaining parameters. Formally this can be written as 



where the parameter vector 6' consists of the subset of parameters in the vector 
excluding the parameter 9 m and where S defines the volume of integration on that 
space. Note that there is no dependence upon the Bayesian evidence in the calculation 
of the marginalised posterior distribution since it is independent of the parameter values 
themselves and can be absorbed into the normalisation of the posterior distribution. 

In practice the calculation of posterior distribution functions can be a difficult 
and computationally intensive procedure. Over the last decade much work has been 
dedicated to the efficient numerical computation of posterior probability distributions 
and more recently to the evaluation of the Bayesian evidence. One of the now 
standard tools available for Bayesian data analysis is the Markov-Chain-Monte-Carlo 
(MCMC) [TJJ [18], an efficient method for obtaining random samples drawn from a 
posterior probability distribution of which there are a number of variations [T9"| 1201 
EU [22j [23l Elj. More recently the strategy known as "nested sampling" [253 EE] has 
given the data analyst the ability to accurately estimate the Bayesian evidence, a model 
dependent quantity used to perform model selection. The first direct application of 
this strategy was to perform cosmological model selection using WMAP data [27] . For 
this work we chose to perform our analysis using the freely available nested sampling 
algorithm MultiNest [28]. Note that this algorithm has been specifically designed to be 
robust with respect to multi-modal posterior distributions and to compute the Bayesian 
evidence. For this work we use it purely to obtain posterior probability distributions on 
the pulsar parameters. 

Let us now define the likelihood functions specific to the two approaches described 
in Sees. [3] and [4| The likelihood function for the Fourier domain approach to the "search- 
mode" data is defined as 




(20) 




(21) 
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where N/2 and M are the total number of Fourier-frequency and radio-frequency bins 
respectively and we define = {A^, w^, DM, P, $ } as the vector of signal parameters. 
We have used er? to represent the frequency domain noise variance which we assume 
to be Gaussian, white, and stationary and therefore constant for all Fourier and radio 
frequency bins. In this ideal scenario the frequency domain noise variance is related to 
the time domain noise variance erf by o~| = iV(At) 2 of . 

The likelihood function for the folded data can similarly be written as 



where N s is the number of equal length sub-intervals into which each frequency channel's 
timeseries has been divided. The noise contribution in a particular folded phase bin is 
simply the sum of n = floor(T/P') Gaussian distributed variables of variance of and 
hence a\ = na 2 . The parameter vector 6 is identical to that defined for the search-mode 
data. 

In general the choice of prior probability distribution functions on the parameters 
would be chosen according to one's prior beliefs on the values of those parameters. 
However, for the purposes of our toy model investigation we choose "flat" prior 
distributions for all parameters with prior ranges chosen to be far greater than the 
expected span of the posterior distributions. In this case we do not favour any particular 
choice of parameter values over any others. We note that in making this choice we 
are disregarding a powerful feature of the Bayesian analysis, the ability to correctly 
incorporate prior information into the result. However, one can show that for strong 
signal-to-noise ratios the effect of the prior on the posterior is dominated by that of the 
likelihood function itself. 

To conclude this section we would like to make it clear that what we have described 
in Sees. [3] and [4] do not constitute two separate models. We have described two separate 
representations of the same original dataset and have in-fact used the same signal model. 
Model selection therefore could not be applied to these two methods. Our aim is to 
compare the effectiveness of each choice of dataset representation by contrasting the 
posterior distributions on the signal parameters when a single common time-radio- 
frequency dataset is used to generate both the Fourier-radio-frequency and a folded 
dataset. Model selection using the Bayesian evidence and the computation of the Bayes 
factor (the ratio of model evidences) and odds-ratio (the Bayes factor multiplied by 
the ratio of prior model probabilities) is a potentially powerful tool in future advanced 
implementations of our analysis strategy. Our choice of nested sampling implementation, 
MultiNest, has been designed specifically to compute the Bayesian evidence, making 
model selection between different pulsar signal models an obvious and easy to implement 
extension of our approach. 
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Figure 4. The marginalised posterior distributions on the signal parameters for a 
simulated signal in Gaussian noise. In solid blue we show the results obtained when 
using the Fourier domain representation of the "search-mode" data. In solid red we 
show the results obtained when using folded data as the input for the case where the 
data was folded with the true pulse period. In dashed red we show the results for the 
folding scenario where an incorrect period, AP — 10 nanoseconds, has been used to 
fold the data and we have accounted for this within the signal model. The dashed 
black curves show the result where this effect has not been accounted for. The vertical 
dotted black lines indicate the values of the true signal parameters. For all results the 
data was converted into the Fourier and folded representations from a single common 
time-frequency dataset of length 100 sec with sampling time 64 /isec and frequency 
range of 8 MHz consisting of 8 channels each of 1 MHz bandwidth. 

6. Discussion 

Shown in Fig. [4]is an example of typical marginalised posterior probability distributions 
on the signal parameters 6 = {A^, w^, DM, P, $ } plus the time-of-arrival parameter 
tTOA- The latter is not independent of the other parameters and is a function of both 
the phase parameter and the pulse period such that ^toa = ^o-P an d is therefore 
defined as the arrival time of the first pulse received at the mid-point frequency channel 
immediately following the mid-point of the observation. 

Our results show that the ability to determine the signal parameters is unaffected by 
the choice of data representation when comparing the Fourier domain approach and the 
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folded data. This is apparent from the consistent widths of the posterior distributions 
which define the uncertainty in parameter estimation. The clear effect that we see is 
the discrepancy between the location of the posterior distributions for the case where 
the error in folding period has been accounted for and where it has not. We see that the 
estimation of the dispersion, pulse amplitude, pulse width, and pulse frequency is only 
marginally affected. However, the phase parameter and therefore the time-of- arrival 
estimate, is strongly biased by the false assumption that the signal has been folded with 
the correct pulse period. For the results shown in Fig. [4] the pulse period error of 10 
nsecs is equivalent to an accumulated phase error of only 3.6 degrees over the length of 
the 25 sec sub-integrations. This appears as a ~ 1.8 degree (= 0.005 cycles) error in 
the estimate of $o leading to a ~ 25 /isecond error in the estimate of the time-of-arrival 
value^J 

It is clear that the work presented here is intended only as a potential starting point 
for more advanced applications of Bayesian data analysis techniques to the problem of 
pulsar timing. A clear difference between our approach as described here and established 
techniques is that we have obtained our pulsar parameter estimates from a single 
simulated observation. The standard approach is to employ a more global strategy 
in which the process of producing a time-of-arrival measurement for a given observation 
is not just a function of the given observation but of all existing observations of the 
pulsar. Each TOA represents the reduction of an entire observation into a single number 
after having performed a global fit (over all observations) for a set of common pulsar 
parameters e.g. pulse period, the period derivatives, the sky position, proper motion, 
pulse shape parameters, the dispersion measure, etc. When new observations are taken, 
the procedure is repeated and these parameters are refined. As discussed in Sec. |4| as 
data is recorded it is often reduced (in terms of data volume) by folding at an assumed 
pulse period and in addition may be partially de-dispersed with an assumed dispersion 
measure. The detrimental effect of this process (as seen in our results) will rapidly 
diminish as more and more observations are made but further analysis is required to 
rule out such effects as contributors to the low-frequency timing noise seen in the msec 
pulsars. 

The scope of this work is limited to the generation of TOAs but we would also like to 
briefly discuss the specific aim of gravitational wave detection using pulsar timing arrays. 
From a purely theoretical Bayesian data analysis perspective in an ideal scenario, firstly, 
one would use an un-reduced dataset spanning all observations of all relevant pulsars. 
Secondly one would construct a model including all pulsar signal parameters and all 
gravitational wave signal parameters. After applying sensible prior distributions to all 
of these parameters one would compute marginalised posterior distributions on both 
pulsar and gravitational wave parameters and perform model selection. We could then 
establish whether the observations coupled with our prior beliefs were consistent with 

jj This observed phase error is half of the total accumulated phase error because the phase parameter 
value is defined at the midpoint of the observation and therefore the phase error effectively accumulates 
over T/2 rather than T. 
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the presence of gravitational waves. In practice this is a very difficult task for various 
reasons but most notably due to the vast computational resources required to process 
the vast volume of un-reduced data and to explore the multi-dimensional parameter 
space describing the entire pulsar array and the intervening gravitational wave. For 
this reason, in terms of gravitational wave detection, constructing a reduced dataset 
is highly desirable. In fact, the problem of gravitational wave detection using timing 
residuals (the difference between the time-of-arrival values and those attained by fitting 
a gravitational wave free pulsar model) as the initial dataset have already been applied 
to the specific case of searching for the gravitational wave stochastic background [TTJ [10] . 

The apparent separation of the complete gravitational wave detection problem into 
a gravitational wave free component, from which a reduced dataset is produced, and 
then a second component in which this reduced dataset is then analysed including the 
effects of gravitational waves, seems potentially problematic. Under the assumption 
that each TOA measurement is independent of all others one can argue strongly that 
the effect of a low-frequency gravitational wave on each measurement is negligible and 
that the TOA truly represents the unambiguous arrival time of an average pulse within 
that observation and defined at some epoch. As soon as one performs a global fit 
(neglecting gravitational waves) over all observations of a given pulsar a gravitational 
wave of sufficient amplitude will affect the best fit pulsar parameters. Such a procedure 
could absorb some fraction of a gravitational wave into the pulsar parameter estimates 
(e.g the pulsar period derivatives). In future work we hope to address this issue and to 
provide a comparison between an analysis using independent TOA measurements as a 
dataset for gravitational wave detection and an analysis using globally estimated TOAs. 

In addition we hope to be able to include, and account for, many of the physical 
effects and data analysis issues that we have ignored in our toy model approach. These 
include a more robust treatment of the noise where we allow time and frequency variation 
and investigate the validity of the assumption of Gaussianity. In reference to this we hope 
to also include the effects of radio frequency interference (RFI) and investigate methods 
in which we are able to analytically marginalise over the noise and therefore potentially 
avoid the need to estimate it. We also aim to include the effect of polarisation into the 
analysis. A search-mode dataset is itself the product of two independent radio signal 
polarisation measurements which are combined as a function of the Stokes parameters. 
These parameters can be incorporated into the Bayesian framework and uncertainties 
on these parameters can be marginalised over in parallel with the signal parameters. 
Less well defined effects to consider include a time and frequency varying pulse profile 
parameterisation, time varying dispersion measure, scattering, scintillation and nulling. 
Finally, we hope to develop this work beyond the toy model to a point at which it 
can be applied to real pulsar data. In such a scenario we will also have to incorporate 
barycentric routines [6] to include the obvious effects of detector motion, sky position 
uncertainty and, where applicable, binary orbital motion. 
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